Authors: Natalia Spitz1, Alexandra Sexton Oates2, Cyrille Cuenin1, Vincent Cahais1, Alexei Novoloaca3, Terence Dwyer4, Siri Eldevik Håberg5, Monica Cheng Munthe-Kaas6, Per Magnus6, Sjurdur Frodi Olsen7, Richard Saffery8, Zdenko Herceg1†, Akram Ghantous1†, on behalf of The International Childhood Cancer Cohort Consortium (I4C)

Affiliation: 1 Epigenomics and Mechanisms Branch, International Agency for Research on Cancer, Lyon, France; 2 Genomic Epidemiology Branch, International Agency for Research on Cancer, Lyon, France; 3 BIOASTER, Lyon, France; 4 University of Oxford, Oxford, United Kingdom; 5 Centre for Fertility and Health, Norwegian Institute of Health, Oslo, Norway; 6 Norwegian Institute of Public Health, Oslo, Norway; 7 Department of Epidemiology Research, Statens Serum Institute, Copenhagen, Denmark; 8 Cancer and Disease Epigenetics, Murdoch Children’s Cancer Institute, Melbourne, Australia

† Equal contribution as senior authors.

Individual CpGs and DMRs meta-analysis

Individual CpGs and DMRs meta-analyses. (A) Manhattan plot summarizing the results of the individual CpG meta-analysis of associations between childhood CNS tumors and DNA methylation at birth. The meta-analysis was restricted to the CpGs common to HM850 and HM450. The red line shows the Bonferroni threshold (< 0.05) for multiple testing. Methylation sites that surpassed the Bonferroni correction threshold are highlighted in green and labeled with the CpG name followed, in brackets, by the nearby gene to which the CpG was annotated. (B) Manhattan plot for the DMR meta-analysis showing an enrichment of DMRs on chromosome 6. Each dot represents a DMR. The red line shows the Bonferroni threshold (< 0.05) for multiple testing. The y axis represents the -log10(p-value) and the direction of the effect. (C) Bar plot depicting the percentage enrichment across CpG density and regulatory regions. The results are based on the significant individual CpGs and all CpGs from the significant DMRs in the meta-analysis, and are compared to a reference set consisting of all CpGs common to HM850 and HM450. The asterisk (*) marks significant enrichment (p < 0.05; chi-squared test). (E) Enrichment of gene ontologies related to different brain parts. The analysis was performed with the Enrichr web tool, taking into consideration the 1069 genes annotated to the 1687 DMRs and 2 individual CpGs. Pathways with a Bonferroni-adjusted p-value lower than 0.1 are displayed. Circle color represents the Bonferroni-adjusted p-value and circle size represents the gene count, as per the figure legend. The terms and the databases from which they were obtained are displayed on the y axis. The x axis represents the gene ratio (number of queried genes in the pathway/total number of genes in the pathway). (F) Significant DMRs (Bonferroni < 0.05) with at least one CpG with effect size > 2%. The CpGs are symbolized by circles of varying sizes and colors, representing the effect sizes and directions of effect, respectively, as per the figure legend.
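As an illustration of two quantities used in this legend, the sketch below computes where a Bonferroni threshold line would sit on the -log10(p-value) axis and the gene ratio plotted on the x axis of the enrichment panel. The number of tests (400,000) and the gene counts are hypothetical placeholders, not values from the study, and the function names are chosen here for illustration only.

```python
import math

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cut-off that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def manhattan_height(p_value):
    """Height of a point (or threshold line) on the -log10(p-value) axis."""
    return -math.log10(p_value)

def gene_ratio(queried_genes_in_pathway, total_genes_in_pathway):
    """Gene ratio as defined in the legend: queried genes / total genes in the pathway."""
    return queried_genes_in_pathway / total_genes_in_pathway

# Hypothetical number of CpGs tested (assumption, not taken from the study).
n_cpgs = 400_000
cutoff = bonferroni_threshold(n_cpgs)   # raw p-value cut-off
line_height = manhattan_height(cutoff)  # position of the red line on the y axis

# Hypothetical pathway: 12 of its 150 genes are among the queried genes.
ratio = gene_ratio(12, 150)

print(f"cut-off p = {cutoff:.3g}, line at -log10(p) = {line_height:.2f}, gene ratio = {ratio:.2f}")
```

A raw p-value below the cut-off is equivalent to a Bonferroni-adjusted p-value below 0.05, which is how the threshold is expressed in the legend.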

Longitudinal analysis and impact of DNA methylation on RNA expression

Longitudinal analysis and impact of DNA methylation on RNA expression. (A) Longitudinal analysis of newborn blood to assess the stability of DNA methylation over time in independent samples. For each significant CpG, a Spearman correlation was computed between DNA methylation and the year of birth of each sample. Each dot represents one sample and is coloured according to the dataset, as per the figure legend. The rs and p.adjust represent, respectively, the Spearman coefficient and the Bonferroni-adjusted p-value from the correlation analysis. (B) Longitudinal case-case analysis showing no correlation between DNA methylation in blood and in tumor tissue in Melbourne paired samples (n=25). The x-axis represents DNA methylation in newborn blood, and the y-axis represents DNA methylation in tumor tissue. The rs and p.adjust represent, respectively, the Spearman coefficient and the Bonferroni-adjusted p-value from the correlation analysis. (C) Comparison of DNA methylation levels in newborn blood (Control and Case) and brain (healthy and tumor tissue). The CpG on the BMF gene showed a significant difference (Bonferroni < 0.05) between cases and controls and the same direction of effect in both blood and brain tissue. Each dot represents one sample, and the red dot represents the mean. Error bars show the 95% CI. The asterisk (*) between Control and Case (newborn blood) represents Bonferroni significance in the meta-analysis, while an asterisk between Healthy Brain and Tumor indicates a Bonferroni-adjusted p-value < 0.05 in an independent-samples t-test. (D) No functional impact of the DNA methylome on RNA expression of the same gene in the blood of healthy children, as investigated by linear regression in an independent set of samples (Gambia cohort, n=93).
(E, F) Association between the DNA methylome and RNA expression of the same gene in childhood medulloblastoma tissue (E; GEO accession = GSE85218, n = 538) and pilocytic astrocytoma tissue (F; GEO accession methylation: GSE44684 and expression: GSE44971, n = 38). The p.adjust represents the Bonferroni-adjusted p-value from the linear regression.
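For readers unfamiliar with the rs and p.adjust values reported in these panels, the following is a minimal sketch of how a Spearman coefficient and Bonferroni-adjusted p-values can be obtained. It is not the authors' pipeline: the data are invented, the function names are hypothetical, and the raw p-values are taken as given rather than computed.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman_rs(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Invented example: methylation of one CpG in five samples by year of birth.
birth_year = [2001, 2003, 2005, 2007, 2009]
methylation = [0.10, 0.12, 0.15, 0.14, 0.20]
rs = spearman_rs(birth_year, methylation)

# Invented raw p-values for three tested CpGs.
p_adjust = bonferroni_adjust([0.01, 0.04, 0.5])
print(round(rs, 3), [round(p, 3) for p in p_adjust])
```

Because the coefficient depends only on ranks, it is insensitive to the scale of the methylation values, which suits comparisons across datasets or tissues.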

Technical validation and survival analysis on Melbourne data set

Technical validation and survival analysis on Melbourne data set. Technical validation by pyrosequencing of the DNA methylation levels of the CpGs on the BMF (A) and UNC119B (B) genes. Strong Pearson correlation between DNA methylation levels detected by HM850 and by pyrosequencing (n=38). R = Pearson correlation coefficient, p = p-value. (C) Kaplan-Meier curve for 2-year overall survival (OS) based on the methylation of the CpG on the BMF gene. Censored data are indicated by cross marks. Hyper = methylation above the mean, hypo = methylation below the mean, p = log-rank p-value. (D) Forest plot of the multivariate Cox regression analysis of OS by BMF methylation, after including cancer type, sex, and age at diagnosis. The squares represent the Hazard Ratio (HR) of each variable, and the horizontal lines show the 95 % confidence intervals (CI). meth_group = samples separated by methylation levels above (hyper) or below the mean (hypo), AgAtDx = age at diagnosis. (E) Kaplan-Meier curve for 2-year relapse-free survival (RFS) based on the methylation of BMF. Censored data are indicated by cross marks. Hyper = methylation above the mean, hypo = methylation below the mean, p = log-rank p-value. (F) Forest plot of the multivariate Cox regression analysis of RFS by BMF methylation, after including cancer type, sex, and age at diagnosis. The squares represent the HR of each variable, and the horizontal lines show the 95 % CI. An HR < 1 indicates a reduced hazard of relapse whereas an HR > 1 indicates an increased hazard of relapse. meth_group = samples separated by methylation levels above (hyper) or below the mean (hypo), AgAtDx = age at diagnosis.
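The grouping and survival curves described in this legend can be illustrated with a short sketch: samples are split into hyper and hypo groups around the mean methylation, and a Kaplan-Meier estimate is computed from follow-up times and event indicators. The data below are invented, the function names are hypothetical, and assigning a sample exactly at the mean to the hypo group is an assumption made here. The log-rank test and Cox regression mentioned in the legend are not reproduced.

```python
def split_by_mean(methylation):
    """Label each sample 'hyper' (above the mean) or 'hypo' (otherwise).

    Assumption: a value exactly equal to the mean is labeled 'hypo'.
    """
    mean = sum(methylation) / len(methylation)
    return ["hyper" if m > mean else "hypo" for m in methylation]

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time of each sample
    events : 1 if the event (death or relapse) occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Invented methylation values for four samples.
groups = split_by_mean([0.2, 0.4, 0.6, 0.8])

# Invented follow-up in months; 0 marks a censored observation.
times = [6, 12, 12, 18, 24]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(groups, [(t, round(s, 2)) for t, s in curve])
```

Censored samples leave the risk set without lowering the curve, which is why they appear only as cross marks on the plotted curves.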